Information Geometry Connecting Wasserstein Distance and Kullback-Leibler Divergence via the Entropy-Relaxed Transportation Problem

Authors

  • Shun-ichi Amari
  • Ryo Karakida
  • Masafumi Oizumi
Abstract

Two geometrical structures have been extensively studied for a manifold of probability distributions. One is based on the Fisher information metric, which is invariant under reversible transformations of random variables, while the other is based on the Wasserstein distance of optimal transportation, which reflects the structure of the distance between random variables. Here, we propose a new information-geometrical theory that provides a unified framework connecting the Wasserstein distance and the Kullback-Leibler (KL) divergence. We primarily consider the discrete case of n elements and study the geometry of the probability simplex S_{n-1}, the set of all probability distributions over n elements. The Wasserstein distance is introduced in S_{n-1} through the optimal transportation of commodities from a distribution p to a distribution q, where p, q ∈ S_{n-1}. We relax the optimal transportation by using entropy, as introduced by Cuturi. The optimal solution is called the entropy-relaxed stochastic transportation plan. The entropy-relaxed optimal cost C(p, q) is computationally much less demanding than the original Wasserstein distance, but it does not define a distance because it is not minimized at p = q. To define a proper divergence while retaining the computational advantage, we first introduce a divergence function in the manifold S_{n-1} × S_{n-1} of optimal transportation plans. We fully explore the information geometry of the manifold of optimal transportation plans and subsequently construct a new one-parameter family of divergences in S_{n-1} that are related to both the Wasserstein distance and the KL-divergence.
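As a rough illustration of the entropy-relaxed transportation problem described in the abstract, the sketch below computes an entropy-regularized transport plan by Sinkhorn iterations, following Cuturi's entropic regularization. The cost matrix M, the regularization strength lam, the helper name sinkhorn_plan, and the reported cost <P, M> are illustrative assumptions made for this page; the paper's one-parameter divergence family built on top of the optimal plans is not reproduced here.

    import numpy as np

    def sinkhorn_plan(p, q, M, lam=10.0, n_iter=1000, tol=1e-9):
        """Entropy-regularized optimal transport via Sinkhorn iterations.

        Approximately solves  min_P <P, M> - (1/lam) * H(P)
        subject to  P 1 = p  and  P^T 1 = q  (Cuturi-style relaxation).
        """
        K = np.exp(-lam * M)              # Gibbs kernel
        u = np.ones_like(p)
        for _ in range(n_iter):
            u_prev = u
            v = q / (K.T @ u)             # scale columns to match q
            u = p / (K @ v)               # scale rows to match p
            if np.max(np.abs(u - u_prev)) < tol:
                break
        return u[:, None] * K * v[None, :]

    # Toy example on the simplex of n = 3 elements
    p = np.array([0.5, 0.3, 0.2])
    q = np.array([0.1, 0.3, 0.6])
    M = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :]).astype(float)  # |i - j| ground cost
    P = sinkhorn_plan(p, q, M)
    C = float(np.sum(P * M))              # entropy-relaxed transport cost <P, M>
    print(P, C)

As the abstract notes, such a relaxed cost is generally not minimized at p = q, which is what motivates the paper's construction of a proper divergence on the manifold of optimal transportation plans.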


Similar articles

Information Measures via Copula Functions

In applications of differential geometry to problems of parametric inference, the notion of divergence is often used to measure the separation between two parametric densities. Among them, in this paper, we will verify measures such as Kullback-Leibler information, J-divergence, Hellinger distance, -Divergence, … and so on. Properties and results related to distance between probability d...
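For concreteness, here is a short sketch (Python helpers assumed for illustration, not taken from the paper above) that evaluates a few of the measures named in that abstract, namely the Kullback-Leibler information, Jeffreys' J-divergence, and the Hellinger distance, for two discrete distributions with full support; the copula-based construction itself is not reproduced.

    import numpy as np

    def kl(p, q):
        """Kullback-Leibler information KL(p || q) for strictly positive discrete distributions."""
        return float(np.sum(p * np.log(p / q)))

    def j_divergence(p, q):
        """Jeffreys' J-divergence: the symmetrized KL divergence."""
        return kl(p, q) + kl(q, p)

    def hellinger(p, q):
        """Hellinger distance between two discrete distributions."""
        return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

    p = np.array([0.5, 0.3, 0.2])
    q = np.array([0.1, 0.3, 0.6])
    print(kl(p, q), j_divergence(p, q), hellinger(p, q))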


The Cramer Distance as a Solution to Biased Wasserstein Gradients

The Wasserstein probability metric has received much attention from the machine learning community. Unlike the Kullback-Leibler divergence, which strictly measures change in probability, the Wasserstein metric reflects the underlying geometry between outcomes. The value of being sensitive to this geometry has been demonstrated, among others, in ordinal regression and generative modelling. In th...
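A minimal sketch of that geometric sensitivity, assuming one-dimensional distributions on a common grid: both the Wasserstein-1 distance and the Cramér distance can be written in terms of the difference of cumulative distribution functions. The grid, helper names, and the choice of the squared-CDF-difference form for the Cramér distance are assumptions made here for illustration.

    import numpy as np

    def wasserstein1_1d(p, q, x):
        """Wasserstein-1 distance on a 1-D grid: integral of |F_p - F_q| dx."""
        dF = np.cumsum(p) - np.cumsum(q)
        dx = np.diff(x, append=x[-1])     # last grid point adds zero width
        return float(np.sum(np.abs(dF) * dx))

    def cramer_1d(p, q, x):
        """Cramer distance on a 1-D grid: integral of (F_p - F_q)^2 dx.

        Some presentations use the square root of this quantity (the l2
        metric between CDFs); the squared form is shown here.
        """
        dF = np.cumsum(p) - np.cumsum(q)
        dx = np.diff(x, append=x[-1])
        return float(np.sum(dF ** 2 * dx))

    x = np.array([0.0, 1.0, 2.0])         # common support
    p = np.array([0.5, 0.3, 0.2])
    q = np.array([0.1, 0.3, 0.6])
    print(wasserstein1_1d(p, q, x), cramer_1d(p, q, x))

Unlike the KL divergence, both quantities decrease as the mass of q is moved closer to that of p along the real line, which is the sensitivity to the underlying geometry that the abstract refers to.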


Relaxed Wasserstein with Applications to GANs

We propose a novel class of statistical divergences called Relaxed Wasserstein (RW) divergence. RW divergence generalizes Wasserstein distance and is parametrized by strictly convex, differentiable functions. We establish for RW several key probabilistic properties, which are critical for the success of Wasserstein distances. In particular, we show that RW is dominated by Total Variation (TV) a...


Model Confidence Set Based on Kullback-Leibler Divergence Distance

Consider the problem of estimating the true density h(·) based upon a random sample X1, …, Xn. In general, h(·) is approximated using an appropriate (in some sense, see below) model fθ(x). This article, using Vuong's (1989) test along with a collection of k (> 2) non-nested models, constructs a set of appropriate models, called a model confidence set, for the unknown model h(·). Application of such confide...


Minimization Problems Based on a Parametric Family of Relative Entropies I: Forward Projection

Minimization problems with respect to a one-parameter family of generalized relative entropies are studied. These relative entropies, which we term relative α-entropies (denoted Iα), arise as redundancies under mismatched compression when cumulants of compressed lengths are considered instead of expected compressed lengths. These parametric relative entropies are a generalization of the usual r...



Journal:
  • CoRR

Volume: abs/1709.10219

Pages: -

Publication date: 2017